INFLUENZA STUDIES.

I. ON CERTAIN GENERAL STATISTICAL ASPECTS OF THE 1918 EPIDEMIC 

IN AMERICAN CITIES. 1 

By Raymond Pearl, Ph. D., Professor of Biometry and Vital Statistics, School of Hygiene and Public 
Health, Johns Hopkins University; Consultant in Vital Statistics and Epidemiology, United States
Public Health Service. 

I. Introduction. 

The pandemic of influenza which swept over the world in 1918 
was the most severe outbreak of this disease which has ever been 
known, and it takes an unpleasantly high rank in the roster of epi- 
demics generally. It is certainly impossible now, and perhaps always 
will be, to make any precise statement of the number of people who 
lost their lives because of this epidemic. But it is certain that the 
total is an appalling one. Undoubtedly a great many more people 
died from this cause than from all causes directly connected with 
the military operations of the Great War. In the United States 
alone conservative estimates place the deaths from the influenza 
epidemic at not less than 550,000, which is approximately five times 
the number (111,179) of American soldiers officially stated 2 to 
have lost their lives from all causes in the war. And the end of the 
epidemic is by no means yet reached. In England and Wales the 
curve of mortality from influenza was even in 1907, seventeen years 
after the epidemic of 1890, higher than it was in any of the 40 years 
preceding 1890. The decline in the mortality rate after the 1848 
epidemic in Great Britain was similarly slow. 3 There is no evident 
reason to suppose that conditions following the first explosion of 
this present epidemic will be essentially different from those which 
obtained in the earlier cases. 

For two reasons the hygienist and epidemiologist should be 
interested in the intensive study, from every possible angle, of the 
present pandemic. In the first place, owing to the advances which 
have been made in every branch of medical science since the epidemic
of 1890, there is now available a much more adequate investigational
armament with which to attack the problems raised by such
an epidemic than was the case earlier. Furthermore, the whole
machinery for getting accurate records of the incidence and results
of the outbreak is much better now than it was 30 years ago.
This is particularly true in the United States. The records of mortality
connected with the present epidemic are unquestionably more
complete and accurate than any that have ever before been available
in this country for any epidemic of anything like so great extent
or force.

1 Papers from the Department of Biometry and Vital Statistics, School of Hygiene and Public Health,
Johns Hopkins University, No. 5. This investigation was carried on in consultation with the United
States Public Health Service, Office of Field Investigations on Influenza, Dr. W. H. Frost, surgeon in charge.

2 As of date Apr. 30, 1919.

3 Cf. Article on "Influenza" in Encyclopedia Britannica, 11th Edition, for a conveniently accessible
verification of these statements.

In the second place, the very magnitude of this epidemic is in 
itself a challenge to the whole medical profession. The hygienists
of the world are the standing army, which is, in theory at least,
maintained by society to organize and hold the defenses against
such dread invaders as these. Such a blow as the present one may
well inspire a slogan like that which saved Verdun, "Ils ne passeront
pas." If every epidemiologist does not take advantage of the 
present opportunity to investigate with all possible thoroughness 
epidemic influenza, to the end of making a better defense next time, 
he will have been derelict in his plain duty. 

The present paper is intended as a first contribution toward the 
statistical analysis of certain phases of the 1918 influenza epidemic. 
It will be followed by further papers in the same series dealing with 
other aspects of the problem. In the first studies in the series 
attention will be confined entirely to the mortality records of some 
forty of the larger cities of the United States. The reason for this 
limitation to mortality only and to large cities is that accurate and 
reliable data within these limitations are now available, and the same 
can not be said of morbidity records, on anything like so general a 
scale. Later it is expected that sufficiently accurate and extensive 
morbidity statistics of the epidemic to warrant statistical analysis 
will be available. 

The data of this study are taken primarily from the Weekly Health 
Index. 1 On account of varying medical opinions as to the properly 
reportable terminal cause of death of persons dying after having had 
influenza during this epidemic, it has been thought safest to use 
death rates from all causes for study, rather than those specifically 
reported to the registrar as due to influenza or pneumonia. Conse- 
quently, we shall deal with death rates from all causes in discussing 
the present epidemic. This makes no practical difference in the 
statistical results, because the deviation of the curves of total mor- 
tality from their normal course during the epidemic was due entirely 
to causes inherently associated with the epidemic itself. The use 
of the death rate from all causes during the epidemic has the further
advantage that it takes into account those deaths which occur
from diseases of the heart or kidneys some weeks or months after an
attack of influenza from which the patient has apparently recovered,
but which in reality is responsible for the fatal break-down of a
part of the organic machinery which had long been weak, and only
required for its complete collapse some such strain as the attack of
influenza superimposed.

1 A typewritten publication issued weekly by the Bureau of the Census, and compiled under the direction
of Dr. W. H. Davis, Chief for Vital Statistics.

The general problem with which the first study in this series will 
have to do is that of the statistical analysis of the first explosive 
outbreak of epidemic mortality in large American cities. As will
presently appear, there was an extraordinary degree of variation 
amongst the different cities in respect of the initial force and duration 
of this first explosion. These differences between cities in respect of 
the severity and suddenness with which they were attacked by the 
disease constitute the first great problem which the epidemic has 
raised. What factors had a causal influence in determining this 
great observed variation among cities? The full significance of this 
problem will be apparent when the facts of variation in force of 
explosive outbreak are before us. The first task of this study is to 
present the data in such a manner as to bring out the real extent and 
magnitude of the variation in the epidemic. 

I am indebted to Mr. John Rice Miner for the greater portion of 
the laborious arithmetic connected with this investigation. 

II. General Survey of the Mortality Curves. 

In order to get in hand the general problem it is desirable to examine 
with some care the mortality by weeks in each of the cities dealt with. 
To this end Figures 1 to 6 have been prepared. On these diagrams 
are plotted, for each city, the annual death rates per 1,000 population 
from all causes, for each week, the data being those of the Weekly 
Health Index. The plotting is done on a logarithmic scale of ordi- 
nates (rates) and an arithmetic scale of abscissas (weeks). 1 The 
curves begin with the week ended July 6, 1918, and continue to 1919. 
The scale is the same for all diagrams, though different combinations 
of parts of the logarithmic "decks" are used in certain cases in order 
to fit the diagrams to the page. 

Anyone examining these curves thus collected together on a uni- 
form scale for comparison can not fail to be impressed by the fact 
that there is an extraordinary amount of difference between different 
cities in respect of the force with which they were struck by the 
epidemic at its initial outbreak. Compare, for example, the Albany, 
Boston, Baltimore, Dayton, or Philadelphia curves with those for 
Atlanta, Indianapolis, Grand Rapids, Milwaukee, or Minneapolis. 
The former curves show an initial sudden explosive outbreak of great
force, while the latter exhibit a much slower and milder increase of
the mortality rate.

1 For a discussion of the advantages of "arithlog" paper see Fisher, I. "The 'Ratio' Chart for Plotting
Statistics." Quarterly Publications Amer. Stat. Assoc., 1917, pp. 577-601.

Figure 1.— Annual death rates, by weeks, per 1,000 population, for 8 cities.

Fig. 3.— Annual death rates, by weeks, per 1,000 population, for 8 cities.

In some cases the curve of the first epidemic outbreak rises to the 
peak (ascending limb) and declines from the peak (descending limb) 
at about the same rate. This condition of affairs is exemplified in 
the Albany and Baltimore curves, to mention but two. In other 
cases the rate of ascent to the peak is very rapid while the decline is
slow and long drawn out. Such a condition is shown in the curves for
Cleveland or St. Paul.

Some of the cities, such as Albany, show but a single well-defined
peak in the mortality curve. Many show two peaks. Boston, New
Orleans, and San Francisco give beautifully typical curves of this
sort. Finally, a few of the cities show three well-marked peaks.
Louisville is a good example of the latter class.

In most cases the first peak was the highest and the second and
third were progressively lower. This was not true in all cases,
however. Milwaukee and St. Louis showed second peaks higher than
the first. The wave-like character of the curves in general is of
great interest. The usual phenomenon was a large first wave followed
by a series of other smaller ones. This general characteristic of the
curves is so pronounced and definite that any epidemiological theory
which is to be at all adequate must take account of it.

It is evident from general inspection of these curves that there is a 
strong justification for taking, as the first general problem in con- 
nection with this outbreak of influenza, the significant causal factors 
concerned in bringing about this observed differentiation between the 
different cities in respect of the form of the epidemic mortality curves. 
The extent and definiteness of the differences between the several 
curves indicate that there must be discoverable clean-cut differentiating
factors which influenced the influenza death rates.

III. Classification of the Data.

As a first step in the analysis it is desirable to make certain rough 
classifications of the facts brought out by the mortality curves. To 
this end Table I has been prepared. In this table are set forth the 
following data regarding each of the cities : 

1. The highest peak death rate attained in any week of the epi- 
demic up to March 29, 1919. 

2. The date 1 on which the highest peak rate was reached.

3. The number of distinct peaks exhibited by the mortality curve 
within the time period here studied. These different peaks indicate 
recrudescences or waves of the epidemic. 

4. The date at which the second peak in the mortality curve oc- 
curred, in the case of those cities showing 2 or more peaks. 

5. The number of weeks elapsing between the first peak and the 
second. 

6. The date at which the third definite peak, if any, occurred in the 
mortality curve. 

7. The number of weeks elapsing between the second peak and the 
third. 

8. The number of weeks during which the mortality rate was
higher than it had been at any time between the week ended July 6, 
1918, and the beginning of the epidemic. The range of fluctuation 
of the weekly annual death rate in the period from July to the end 
of September was held to be sufficiently accurate indication of the 
normal range of fluctuation of the death rate in any particular city. 

9. The number of weeks elapsing from the beginning of epidemic 
mortality to the highest peak of the curve. This gives a measure 
of the time factor on the ascending side of the epidemic explosion. 

10. The number of weeks elapsing from the time of the highest
peak of the mortality curve to the time when the curve came again
within the normal range of fluctuation. This gives the time factor
on the descending limb of the epidemic outbreak.

11. The excess mortality rate, over the normal for the same season 
of the year for the same places, for the 25 weeks between September 
8, 1918, and March 1, 1919. These figures were issued as a supple- 
ment to the Weekly Health Index by the Census Bureau. 2 

From this table a number of points present themselves for discus- 
sion. They may best be taken up in separate sections, in order of 
the successive rubrics of the table. 

1. Maximum peak death rates. — The highest or maximum peak rate
of mortality during the epidemic varied greatly, having ranged from
31.6 in the case of Grand Rapids, Mich., to 158.3 in the case of
Philadelphia.

1 It is to be understood that all dates here and throughout are as of "weeks ended" on the specified
date. The original statistics are given only in weeks and hence any finer time differentiation is impossible.

2 Cf. Public Health Reports, vol. 34, No. 11, p. 505, 1919.

The distribution of the different maximum peak rates over this 
range is shown in detail in Table II. 

Table II. — Showing the frequency of occurrence of different maximum peak death rates
during the epidemic.

Maximum peak rates.        Number of cities.

 30.0- 39.9                        6
 40.0- 49.9                        4
 50.0- 59.9                        5
 60.0- 69.9                        5
 70.0- 79.9                        5
 80.0- 89.9                        4
 90.0- 99.9                        2
100.0-109.9                        5
110.0-119.9                        1
120.0-129.9                        1
140.0-149.9                        1
150.0-159.9                        1

Total.                            40

From Table II it appears that in the 40 cities considered the peak
rates which were of the most frequent occurrence were, generally
speaking, rates below 70. Twenty out of the 40 fell below that
figure. Only 9 out of the 40 cities showed a maximum peak rate of
100 or more. Up to a maximum peak rate of 70 the distribution is
very even in the four classes of 10 points each in the rate. From
70 on it falls off rapidly, with the single exception of the class of
rates from 100 to 109.9, which has a frequency of 5.

The detailed distribution of the maximum peak rate is shown 
graphically in Figure 7. 

Table III. — Constants for maximum peak death rates.

Constant.                  Value.

Mean                       73.9 ± 3.2
Median                     70.0 ± 4.0
Standard deviation         30.3 ± 2.3

Three of the cities, Milwaukee, Kansas City, and St. Louis, show 
higher maximum peak rates on the second wave than on the first. 
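
The three ± figures of Table III are reproduced if they are read as probable errors of a mean, a median, and a standard deviation from a sample of 40. The paper does not state its formulas, so the following is only a sketch under that assumption, using the usual normal-theory expressions; the function name is illustrative.

```python
import math

PE_FACTOR = 0.6745  # probable error = 0.6745 x standard error (normal theory)

def probable_errors(sd, n):
    """Return probable errors of the mean, the median, and the standard deviation."""
    pe_mean = PE_FACTOR * sd / math.sqrt(n)
    pe_median = 1.2533 * pe_mean              # median of a normal sample
    pe_sd = PE_FACTOR * sd / math.sqrt(2 * n)
    return pe_mean, pe_median, pe_sd

# Values taken from the text: standard deviation 30.3, 40 cities.
pe_mean, pe_median, pe_sd = probable_errors(30.3, 40)
print(round(pe_mean, 1), round(pe_median, 1), round(pe_sd, 1))  # 3.2 4.0 2.3
```

The same expressions applied to the peak dates (standard deviation 15.75 days, 40 cities) give 1.68 and 1.19 days, in agreement with the figures quoted for the mean peak date.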

2. Date of occurrence of maximum peak rate. — The date of the
week in which the maximum peak rate occurred is given in the third
column of Table I. It will be seen that the earliest date, October 5,
occurs but twice, namely, in Boston and Cambridge. These two
cities, of course, are in a demographic sense practically a single unit
though politically separate. At the other extreme the latest maximum
peak rate date is December 14. The cities showing a culmination
of the epidemic mortality during the week which ended on this
latter date are Grand Rapids, Milwaukee, and St. Louis. Grand
Rapids has an extremely peculiar curve, unlike that of any other
city in the country. Milwaukee and St. Louis are two of the cities
showing the second peak higher than the first, so in these two cases
the date in the third column of Table I refers to the second peak,
while in all other cities it refers to the first peak. On these accounts
the upper range end for maximum peak date should probably not
be taken as December 14, but as November 2, since the only other
later date, November 16, appears in a single case, St. Paul, and the
curve for that city is again abnormal. There are five cities showing
the peak of the mortality curve in the week ended November 2,
namely, Cleveland, Los Angeles, Oakland, Pittsburgh, and San
Francisco.

Fig. 7.— Distribution of maximum peak death rates in 40 cities. Certain constants of the distribution
shown in Table II are exhibited in Table III.

The distribution of maximum peak dates is shown in Table IV, 
and graphically in Figure 8. 



August 8, 1919. 1756 

Table IV. — Distribution of dates of maximum peak mortality.

[Table IV lists, under "Maximum peak in week ended" and "Number of cities," the number of cities reaching their maximum peak in each week; total, 40 cities. The individual rows are not legible in this copy.]

Fig. 8.— Distribution of peak dates of the epidemic.

Using all the data, we find the following constants for date of 
maximum peak. 
Mean peak date = October 23 ± 1.68 days.
Standard deviation in peak date = 15.75 ± 1.19 days.

These constants will serve as a useful record of the time factor in
the epidemic of the autumn of 1918 in American cities.

Thirty-one out of the 40 cities had attained the peak rate of mor- 
tality prior to November 2. 

3. Number of peaks in mortality curve. — It is clear from the dia- 
grams already shown that there was considerable variation in the 
different cities in respect of the number of epidemic mortality peaks 
exhibited. 

The details on this point are shown in Table I. Putting the data 
together in the form of a frequency distribution we have the results 
shown in Table V. 



Table V. — Showing number of distinct peaks in mortality curve from the beginning of the
epidemic to Apr. 1, 1919.

Number of distinct peaks.    Number of cities.    Per cent of cities.

1                                    6                    15
2                                   26                    65
3                                    8                    20

Total                               40                   100

Thus it is seen that 26, or 65 per cent, of the 40 cities showed two 
distinct peaks in the mortality curve, while 6, or 15 per cent, had one 
peak, and 8, or 20 per cent, had three peaks. The diminishing wave- 
like character of the successive peaks is clearly shown in the diagrams. 

4. Dates of second and third peaks of mortality. — In the case of cities
having two or three peaks the distribution of dates of occurrence of 
the second peak is shown in Table VI. 



Table VI. — Distribution of second-peak dates.

[Table VI gives, for each week ended from November 30 to February 1, the occurrence of the second peak in 2-peak cities, in 3-peak cities, and in all cities; total for 2-peak cities, 26. The individual frequencies are not legible in this copy.]

Certain interesting facts stand out clearly from this table. In the 
8 cities which had three distinct peaks of mortality the second peak 
came early, prior to December 28. The distribution for the 26
cities having two peaks of mortality is distinctly bimodal, 12 of them
showing a mode for the week ended December 21, and 14 a mode 
somewhere in the weeks of January 18 and 25. No city had a second 
peak of mortality in the week ended January 11. 

Table VII gives the distribution of dates of the third peak of mor- 
tality. 

Table VII. — Distribution of third-peak dates.

Week ended —        Occurrence of third peak.

March 8                      1
March 15                     4
March 22                     3

Total                        8

Here the observed mode evidently falls somewhere in the week 
ended March 15. 

The data of Tables VI and VII are shown graphically in Figure 9. 

The figures and diagram at once suggest that the group of 12 two- 
peak cities showing the second peak somewhere between December 
7 and January 4 were cities which at that time were presumably 
destined to show a third distinct wave and peak of mortality, but 
in which for some reason not now apparent the third wave did not 
eventuate. In contradistinction to these stand the 14 cities showing 
a second peak of mortality between January 11 and January 21. 
These latter are presumably cities in which the complex of factors 
determining the form of the mortality curve was such as to lead 
definitely to a two, and only two, peaked curve. This idea will be 
substantiated by further evidence to be presented immediately. 

As a matter of record of the epidemic in American cities, the mean 
dates calculated from Tables VI and VII are given in Table VIII. 

Table VIII. — Constants for dates of second and third mortality peaks.

Item.                                            Mean.                  Standard deviation.

Date of second peak                              Jan. 1 ± 2.13 days     18.40 ± 1.51 days
Days from beginning of October to second peak    92.26 days
Date of third peak                               Mar. 14 ± 1.10 days    4.63 ± 0.78 days
Days from beginning of October to third peak     165.25 days

Putting all the data together we find for the whole group of cities 
the following average relations : 

(a) Days from average date of maximum peak in all cities to second
peak in cities showing two or three mortality peaks = 69.26.

(b) Days from date of second peak, in all cities showing two or
more peaks, to third peak, in cities having three mortality peaks
= 72.99.

FIG. 9.— Frequency of occurrence of second and third peaks of mortality at different dates.

These relations seem at first sight to point to a cycle of about 10
weeks' duration in the secondary mortality waves of this influenza
epidemic, after the first wave. This point can, however, be more
accurately discussed by reference to the data set forth in Table I
on the number of weeks elapsing between the successive peaks.
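
The two intervals (a) and (b) follow by simple subtraction of the mean dates, and dividing by 7 gives the cycle of about 10 weeks. A small worked check, assuming that the mean first-peak date of October 23 corresponds to 23 days from the beginning of October (that correspondence is an inference from the figures, not something the text states):

```python
# Mean peak dates as days from the beginning of October (values from the text).
mean_first_peak = 23.0      # October 23 (assumed to count as 23 days)
mean_second_peak = 92.26    # about January 1
mean_third_peak = 165.25    # about March 14

first_to_second = mean_second_peak - mean_first_peak
second_to_third = mean_third_peak - mean_second_peak

for label, days in [("first to second", first_to_second),
                    ("second to third", second_to_third)]:
    print(f"{label}: {days:.2f} days = {days / 7:.1f} weeks")
```

Both intervals come out close to 10 weeks, which is the tentative cycle referred to above.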

These data are presented in the form of frequency distributions 
in Table IX. 

Table IX. — Frequency distributions of number of weeks elapsing between successive
mortality peaks.

                              Number of cities.

Number of      Between first and second peak.             Between second
weeks.         All cities.   2-peak cities.   3-peak cities.   and third peak.

 6                  3             ...               3               ...
 7                  4              2                2               ...
 8                  6              4                2               ...
 9                  3              2                1               ...
10                  1              1               ...              ...
11                  4              4               ...               1
12                  2              2               ...               2
13                  7              7               ...               2
14                  2              2               ...               2
15                  2              2               ...              ...
16                 ...            ...              ...               1

Total              34             26                8                8


From this table it appears clearly that there was a definite tendency
for the two-peak cities to fall into two groups in respect of the
time elapsing between first and second peaks. About a third of them
had the second mortality peak around 8 weeks after the first peak.
The remaining two-thirds had the second peak, on the average,
about 13 weeks after the first. The three-peak curves had the second
peak on an average 7.1 ± 0.3 weeks after the first, and the third peak
on an average 13.1 ± 0.3 weeks after the second. The cycle in the
epidemic waves would therefore appear to be nearly a multiple of
7 weeks rather than the 10 weeks tentatively deduced from the dates
of peaks. There the process of averaging obscured the true relations.
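
How averaging can obscure a two-group distribution may be seen by computing the means directly. The sketch below uses the frequencies as read from Table IX, which should be treated as an assumption, and it divides the two-peak cities at 10 weeks, a dividing line chosen here for illustration and not stated in the text.

```python
def mean_weeks(freq):
    """Mean of a frequency distribution given as {weeks: number_of_cities}."""
    n = sum(freq.values())
    return sum(weeks * count for weeks, count in freq.items()) / n

# Weeks between first and second peak, as read from Table IX (assumed reading).
two_peak = {7: 2, 8: 4, 9: 2, 10: 1, 11: 4, 12: 2, 13: 7, 14: 2, 15: 2}
three_peak = {6: 3, 7: 2, 8: 2, 9: 1}

# Hypothetical split of the two-peak cities into an early and a late group.
early = {w: c for w, c in two_peak.items() if w <= 10}
late = {w: c for w, c in two_peak.items() if w > 10}

print(round(mean_weeks(two_peak), 1))    # pooled mean, falling between the groups
print(round(mean_weeks(early), 1))       # early group, about 8 weeks
print(round(mean_weeks(late), 1))        # late group, about 13 weeks
print(round(mean_weeks(three_peak), 1))  # three-peak cities, about 7 weeks
```

The pooled mean of roughly 11 weeks describes neither group well, which is the sense in which the averaged dates pointed to a 10-week cycle.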

5. Duration of explosive outbreak. — We may next consider the
question of the duration in weeks of the explosive epidemic outbreak.
The pertinent data are given in the columns of Table I headed
"Weeks rate was outside normal range," "Weeks, start to peak,"
"Weeks, peak to normal rate." In discussing any question of duration
of an epidemic outbreak of a disease it is necessary to define
sharply and usually arbitrarily what are to be taken as limiting
points. It is always difficult, and usually impossible, to define these
limiting points precisely and logically so that no one will or can
criticize their location. The point has recently been discussed by
Hitchcock and Carey 1 who say: "The difficulty * * * lies in deciding
at just what point an undue prevalence or outbreak becomes
epidemic." The general epistemological principle to be observed is
clearly this: That since it is usually impossible to say with mathematical
precision, in the case of an endemic disease, exactly when
an epidemic outbreak begins or ends, one must, in order to avoid
unconscious bias in dealing with a series of different localities, lay
down an arbitrary rule and follow it absolutely. Then the results
will be correct relative to each other, even though there may be room
for argument as to whether they are absolutely correct or not.

1 Hitchcock, J. S., and Carey, B. W., "A Median Epidemic Index." Amer. Jour. Public Health, Vol. IX,
pp. 355-357, 1919.

Following this principle the following rule was laid down and has
been used throughout: The epidemic mortality was considered to 
have begun in any city on the date when the mortality curve for that 
city first passed outside the range of fluctuation exhibited by the
curve between the week ended July 6, 1918, and the end of the week 
immediately preceding the epidemic rise of the curve. The mortality 
of the first epidemic outbreak was considered to have ended on the 
date when the curve again passed within the same range of fluctuation. 
This measure of duration is admittedly rough, but I think it suffices 
for a first approximation to the facts. It must be clearly understood 
that the data collected under this definition will not measure the 
duration of the epidemic, with any accuracy at all, for several reasons. 
In the first place, we are dealing in this paper solely with mortality 
and not at all with morbidity. The mortality of an epidemic can 
only begin a definite and significant period of time after the epidemic 
incidence of the disease has begun. In the second place, the arbi- 
trary definition on which we are operating here will include both 
peaks of some 2-peaked curves and only the first peak of others, the 
differentiating factor being of course whether the mortality curve 
dropped down to within the "normal" range between peaks or did 
not. Now while this will seem to some a serious, not to say totally 
invalidating, criticism of the here defined measure of duration of 
first outbreak, I think it really has no weight at all. The facts are 
that in some cities (A) there was a sharp explosive outbreak of epi- 
demic mortality. The death rate curve went up abruptly and 
came down abruptly till it was as low as it was before the 
epidemic outbreak. In other cities (B) the curve went up abruptly
and came down, but only some part of the way, distinctly not 
reaching so low a rate as prevailed before the epidemic. Now by 
any canons of common sense it would seem clear that in the A 
cities the particular epidemic outbreak about which we are talk- 
ing came to an end when the death rate was again normal for the 
locality and season. Subsequently the death rate may have again 
risen abruptly. But if it did it was a new and distinct epidemic 
outbreak, temporally and spatially related to the first outbreak if 
one likes, but definitely separated from it by a longer or a shorter 
period in which the mortality rate was normal. Conversely in the 
B cities even though the mortality rate did decline from the maximum 
peak rate, still it did not go back to normal, or in other words it
remained an epidemic mortality, in the common sense of that word.
The rate after this depression may have risen to a new second peak,
but all the time it was part of the same epidemic outbreak. Thus it
clearly appears that there is a real distinction between the A cities
and the B cities. This distinction is reflected perfectly in the duration
definition here adopted, and would be wholly lost in any scheme
of measuring duration by peaks alone. It only needs to be kept
firmly fixed in mind that we are here measuring the length of time 
during which the death rate was higher than the normal death rate 
for the same city, in the first continuous outbreak of influenza 
mortality. 
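
The rule laid down above can be stated mechanically. The following is a minimal sketch only, not the procedure actually used in preparing Table I: the function name and the weekly figures are hypothetical, and the convention of counting the peak week with the ascending limb is an assumption, chosen so that the two limbs add up to the whole duration (as the published means, 3.90 and 8.00 against 11.90 weeks, suggest).

```python
def first_outbreak(rates, baseline_weeks):
    """Apply the arbitrary duration rule to one city's weekly death rates.

    rates: weekly annual death rates, beginning with the week ended July 6, 1918.
    baseline_weeks: number of leading weeks taken as the normal period.
    """
    upper = max(rates[:baseline_weeks])      # top of the normal range of fluctuation
    # Epidemic mortality begins when the curve first passes above that range.
    start = next((i for i in range(baseline_weeks, len(rates))
                  if rates[i] > upper), None)
    if start is None:
        return None                          # the curve never left the normal range
    # It ends when the curve again comes within the range (or the record stops).
    end = next((i for i in range(start, len(rates)) if rates[i] <= upper),
               len(rates))
    peak = max(range(start, end), key=lambda i: rates[i])
    return {
        "weeks_outside": end - start,
        "start_to_peak": peak - start + 1,   # assumed: peak week counted here
        "peak_to_normal": end - peak - 1,
    }

# Hypothetical city: 13 normal weeks (July to September), then an outbreak.
example_rates = [12, 13, 11, 12, 14, 13, 12, 13, 12, 13, 14, 12, 13,
                 20, 45, 80, 60, 40, 30, 22, 16, 13, 12]
print(first_outbreak(example_rates, 13))
# {'weeks_outside': 8, 'start_to_peak': 3, 'peak_to_normal': 5}
```

A city of type B, whose curve stays above the normal range between two peaks, is automatically counted as one continuous outbreak under this rule, since the search for the end of the outbreak does not stop until the rate is again within the range.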

We may first consider the total number of weeks that the mortality 
was outside the July to September range of fluctuation. The fre- 
quency distribution is given in Table X. 

Table X. — Frequency distribution of cities in respect of number of weeks mortality curve
was outside "normal" range of fluctuation in first outbreak.

[Table X gives the number of cities for each duration from 5 to 23 weeks, together with a column of combined figures for grouped durations; total, 40 cities. The individual frequencies are not legible in this copy.]

The range of variation in the duration of the first outbreak of 
epidemic mortality, as here defined, is great, from five weeks on the 
one hand (Richmond, Va.) to 23 weeks on the other (Atlanta, Ga.).
So great is this variation that its general trend is not easily compre- 
hended until the figures are somewhat combined. If that is done, 
certain general relations appear. First of all, it is to be noted that 
20 cities, exactly one-half the total number, showed a duration as 
here defined of 10 weeks or less, while in the other half the duration 
was 11 weeks or over. The median duration was then 10.5 weeks. 

In general, the tendency was for the shorter duration to occur 
more frequently. This is well shown by Figure 10, which is plotted 
from the last column of combined figures in Table X. 

Considerably the largest single area in the histogram is the first
one covering durations of five to eight weeks inclusive. The frequencies
for the longer periods, shown in four-week groups, become
successively smaller.

Fig. 10.— Frequency of different durations of the first outbreak of epidemic mortality.

From the ungrouped data of Table X the following constants have
been calculated:

Mean duration of epidemic mortality in the first outbreak = 11.90 ±
0.55 weeks.

Standard deviation = 5.17 ± 0.39 weeks.

We may next consider the two limbs of the explosive mortality 
curve. The frequency distributions for the time duration of the 
ascending limbs and the descending limbs are given in Table XI.

Table XI.— Frequency distributions for two moieties of epidemic mortality curve (first
outbreak).

                                   Frequency.

             Normal to peak                      Peak to normal
Weeks.       (ascending      Cumulated           (descending      Cumulated
             limb).          frequency.          limb).           frequency.

 2                5               5                  ...              ...
 3               17              22                    2                2
 4               12              34                   13               15
 5              ...              34                    5               20
 6                3              37                    3               23
 7              ...              37                    1               24
 8                1              38                  ...               24
 9              ...              38                    2               26
10                1              39                    1               27
11                1              40                    1               28
12              ...              40                    3               31
13              ...              40                    2               33
14              ...              40                    1               34
15              ...              40                    2               36
16              ...              40                    3               39
17              ...              40                  ...               39
18              ...              40                  ...               39
19              ...              40                    1               40

Total            40                                   40

The first point which strikes one from this table is that it, in
numerical form, confirms what is apparent from inspection of the
individual curves, namely that (a) the epidemic mortality curve in
the first outbreak tends in general to ascend to the peak at a more
rapid rate, or in other words more abruptly than it descends; and (b)
there is a great deal more variation among the cities in respect of
the time interval covered by the ascending limb of the mortality
curve than in the time required for the mortality to come from the
peak rate back to normal. In 34 of the 40 cities it required 4 weeks
or less time for the mortality rate to pass from normal to its epidemic
peak. But in only half as many (17) of the cities did the rate come
down from its peak to normal again in a period of 4 weeks or less.

The constants of the two distributions are as follows: 

Mean time from normal mortality rate to peak = 3.90 ±0.21 weeks. 

Standard deviation in time from normal mortality rate to peak = 
1.93 ±0.15 weeks. 

Mean time from peak mortality rate to normal = 8.00 ±0.50 weeks. 

Standard deviation in time from peak mortality rate to normal = 
4.68 ±0.35 weeks. 
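
The constants just given can be checked against the frequencies of Table XI. What follows is a minimal sketch in Python and not the computation as originally carried out. It assumes the standard deviation is taken with divisor N, and that the probable errors are the conventional 0.6745σ/√N for a mean and 0.6745σ/√(2N) for a standard deviation; under those assumptions the quoted figures are reproduced. The function name and the dictionary layout are illustrative only.

```python
import math

# Frequencies as read from Table XI: weeks -> number of cities.
ascending = {2: 5, 3: 17, 4: 12, 6: 3, 8: 1, 10: 1, 11: 1}
descending = {3: 2, 4: 13, 5: 5, 6: 3, 7: 1, 9: 2, 10: 1, 11: 1,
              12: 3, 13: 2, 14: 1, 15: 2, 16: 3, 19: 1}


def constants(freq):
    """Return (mean, PE of mean, SD, PE of SD) for a frequency table."""
    n = sum(freq.values())
    mean = sum(weeks * f for weeks, f in freq.items()) / n
    # Assumption: population form of the variance (divisor n).
    variance = sum(f * (weeks - mean) ** 2 for weeks, f in freq.items()) / n
    sd = math.sqrt(variance)
    # Assumption: conventional probable-error multipliers.
    pe_mean = 0.6745 * sd / math.sqrt(n)
    pe_sd = 0.6745 * sd / math.sqrt(2 * n)
    return mean, pe_mean, sd, pe_sd


for name, table in (("ascending", ascending), ("descending", descending)):
    mean, pe_mean, sd, pe_sd = constants(table)
    print(f"{name}: mean = {mean:.2f} ± {pe_mean:.2f}, SD = {sd:.2f} ± {pe_sd:.2f}")
```

The ratio of the two means, 8.00 to 3.90, is the "about twice as many weeks" of the text.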

From these figures it appears that on the average it took about
twice as many weeks for the mortality curve to come back from its
peak condition to the normal again, as were required for the increase
from normal to peak at the beginning of the explosion. In round
figures, the ascending limb of the mortality curve occupied about a
month and the descending limb about two months.

The differences between the two distributions of Table XI are
well shown graphically in Figure 11, in which the cumulated or
integral curves are plotted.

Fig. 11.— Cumulated frequency curves for time covered by (a) ascending limb, and (b) descending limb of epidemic mortality curve.

6. Excess mortality. — Early in March, 1919, the Census Bureau
issued a supplement to its Weekly Health Index showing for 34 of
the 40 cities of Table I the mean excess rate of mortality due to the
epidemic for the period of 25 weeks preceding March 1. These data
are given in the last column of Table I. They are arranged in the
form of a frequency distribution in Table XII.

Table XII. — Excess mortality for 25-week period.

Mean excess mortality rate.    Number of cities.

1-1.9                                  1
2-2.9                                  6
3-3.9                                  6
4-4.9                                  4
5-5.9                                  9
6-6.9                                  3
7-7.9                                  4
8-8.9                                  1

Total                                 34


Considering the small numbers involved, this is a fairly smooth 
unimodal distribution. Half of the cities have excess rates below 
five, and half above. Calculating from the ungrouped material we 
find:

Mean 25-week excess mortality rate = 4.75 ± 0.20.

Standard deviation in 25-week excess mortality rate = 1.76 ± 0.14.

7. Summary of variation data. — Summarizing, it may be said that 
the purpose of the material so far presented is simply to place in 
orderly array the basic statistical characteristics of the weekly
mortality curves of the 1918-19 influenza epidemic in American 
cities, to the end that the extraordinarily great and entirely distinct 
differences between different cities in respect of the various charac- 
teristics of the epidemic may be apparent. It is essential to make 
this variation distinctly evident as a preliminary to the analytical 
discussion of its causes. It has been shown clearly that in respect 
of each of the following attributes or characters of the epidemic 
mortality there was a marked variation among the 40 American 
cities studied. 

1. General form of mortality curve.

2. Maximum peak mortality rate.

3. Peak dates.

4. Number of distinct peaks in mortality curve.

5. Time between peaks of mortality.

6. Steepness of ascending and descending limbs of mortality curve.

7. Excess mortality rate.

8. Duration of epidemic mortality.

The variation among cities in these different epidemiological 
characters constitutes a problem of first-class hygienic interest and 
importance. Why did it exist? Why were not all cities at least 
reasonably alike in their influenza epidemic? If we can find sound
and correct, even though only partial, answers to these questions 
we shall have gained greatly in that understanding of the epidemiology 
of influenza which must always underlie any effective control of it. 
It is to the analysis of this problem that attention will next be 
devoted. 



IV. Epidemicity Indices.

With the variation data in hand one further step is necessary 
before the analysis by multiple correlation can be completed. We 
must have a single numerical measure or index of the force of the 
epidemic explosion in any particular place. In the earlier sections 
we have seen that the mortality curves in some cities have a single 
very sharp peak, while in other cases the curve of epidemic mortality 
is a long, low, flat curve. To deal practically with such differences, 
it is essential to have some single numerical index which will be
sensitive to changes of any order in the curve, and at the same time 
will measure the essential characteristic which we want to measure 
in an epidemic curve. 

Confining the discussion to mortality solely, it appears to the 
writer that the essential characteristic of an epidemic curve is that
the death rate rises with greater or less abruptness above its normal 
level to a peak, more or less pointed, and then declines again to the 
normal level, in a more or less steep or abrupt manner. In such a 
movement of the death rate curve there are two fundamental vari- 
ables, namely, (a) the time during which the mortality departs from 
its normal level, and (b) the extent or degree of departure. If we 
suppose the time (a) made a constant then the extent of departure 
measures the force of epidemic mortality. In general, common sense 
would indicate that any measure of the force of an epidemic, or, in a 
single word, any measure of the epidemicity of a disease must 
properly incorporate both these variables. 

In the discussion of the desiderata of an epidemicity index it will 
help to have some simple diagrams of different types of epidemics. 
For this purpose Figures 12 and 13 are introduced. They are purely 
hypothetical illustrations. 

In each of the two epidemics shown in these diagrams the same 
number of people died and the peak death rate was reached at the 
same time. But clearly the outbreak depicted in Figure 12 would 
be generally regarded as a more severe or explosive epidemic, qua 
epidemic, than the one shown in Figure 13. Such changes of the 
death rate as are shown in Figure 13 may indeed not be regarded as 
epidemic at all. We do not commonly think of the seasonal rise in 
the endemic influenza rate as an epidemic. Yet it is quantitatively 
of the same order as the circumstances depicted in Figure 13. It is 
of the essence of the idea of an epidemic, as commonly held, that it 
should have something of an explosive character — that is, there 
must be a relatively large increase in the death (or morbidity) rate, 
occurring in a relatively short space of time, in order to constitute 
an epidemic.

This being so, any proper measure of the degree of epidemicity
must first of all measure the degree of explosiveness of the outbreak of 
the disease under discussion. There are a number of ways, mathe- 
matically, in which this can be done. The decision as to which is the 
best method will turn upon the degree of sensitiveness with which 
each measures the essentially explosive feature of the outbreak. 
In arriving at a measure of epidemicity for the analytical study of 
the influenza epidemic in American cities five different plans have 
been tried. We may now discuss these different indices, and decide 
upon which is the best for present purposes. The data used are the 
weekly mortality rates for thirty-nine American cities dealt with in 
earlier sections. 

1. Standard deviation of epidemic. — The first epidemicity index 
which would occur to the biometrician is that expressed by the 
standard deviation of the epidemic outbreak, measured in weeks, 
the death rates being regarded as frequencies. An epidemic curve 
like that of Figure 12 obviously has a smaller standard deviation in 
time than one such as is shown in Figure 13. In general, the greater 
the explosiveness of the outbreak the smaller will be the standard 
deviation. Practically the manner in which this index is calculated 
is as follows: 

(a) Take as the basis of calculation the duration of the epidemic 
outbreak as defined earlier. 1 

(b) Within the range so defined calculate the standard deviation 2
in weeks in the ordinary way, the observed death rates being taken
as ordinates.

In the present instance the constant takes this form: Let y
denote the death rate in a particular week, and x the deviation of the
week in which that rate occurred from the mean. Then, if I_1 denotes
the epidemicity index, we have

$$ I_1 = \sqrt{\frac{\sum_{1}^{N} (y x^2)}{\sum_{1}^{N} (y)}} $$

where N is the number of weeks in the epidemic period, and Σ denotes
summation. This index is easy to calculate and has a definite
physical meaning. Practically, it would probably be desirable if I_1 were
to be used as an epidemicity index generally, to take some multiple
of its reciprocal for tabling, since as the index now stands it becomes
numerically smaller as the explosiveness of the epidemic becomes
greater. The value 100/I_1 would be satisfactory.

1 Vide p. 1760.

2 The "standard deviation" is a well-known constant used in biometric work. It is the root-mean-
square deviation about the mean. For a detailed discussion of this constant see Yule's "Introduction to
the Theory of Statistics," or any of the modern texts on elementary statistical methods.

2. Variation of excess death rates. — Another measure of
epidemicity which may be considered is of a more complex character than
the last. Its nature may be indicated symbolically as follows:

Let M = mean death rate during epidemic, the latter being
delimited as to duration by the definition in an earlier section already
referred to;

M' = mean death rate in the period from July 6, 1918, to
outbreak of epidemic;

M'' = M - M' = increase in mean death rate during epidemic;

$$ S = \sqrt{\frac{\sum (y^2)}{n}} $$

where y is the deviation of any particular week's death
rate from M, and n is the number of weeks in the epidemic period.
S is the standard deviation of the epidemic death rates, each equally
weighted.

Then the second epidemicity index is

$$ I_2 = \frac{100\,S}{M''} $$

This quantity will increase as the explosiveness of the outbreak
increases. In ordinary biometric terminology it is the coefficient of
variation of the weekly death rates in the epidemic period, referred
to the mean excess rate as a base.

3. Mean increase in death rate during epidemic. — As a third
epidemicity index we may take the quantity called M'' in the preceding
section. We then have

$$ I_3 = M'' $$

4. Twenty-five weeks excess rate. — It has been suggested that the
average excess weekly annual death rate for the 25 weeks ended
March 1, 1919, might be used as a measure of the force of the
epidemic. Indeed, it has been so used practically by various health
officials. In the present connection we may designate this measure
as I_4.

5. Peak-time ratio.— An epidemicity index which immediately
makes strong appeal by virtue of its simplicity is a constant for any
mortality curve which may be called the peak-time ratio. The
symbolical expression for it is:

$$ I_5 = \frac{P - M'}{T} $$

where P denotes the maximum peak mortality rate observed during
the duration T of the epidemic, T being delimited by the definition
stated earlier in this paper, and M' is the quantity defined under the
same symbol in section 2 above. This index increases as the
explosiveness of the outbreak increases. In fact, it measures
explosiveness in the most simple and direct way possible.

V. Numerical Values of Epidemicity Indices.

It is evident at once that these five indices have different degrees
of validity and usefulness. Before attempting to discuss them in
detail, however, it will be well to get the numerical values for each,
in the case of each of the 39 cities under discussion. This is done
in Table XIII.
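
Before turning to the observed values, the behavior of the indices defined in Section IV can be illustrated on two purely hypothetical curves, one short and pointed, the other long and low. The sketch below is only an illustration under stated assumptions: the weekly rates and the normal rate M' are invented, the list of rates is taken to be the whole epidemic period (so that T is its length), and I_1 is computed as the ordinary standard deviation in weeks with the death rates as frequencies. I_4 is omitted because it rests on the Census Bureau excess rates and not on the curve alone.

```python
import math


def index_1(rates):
    """I_1: standard deviation in weeks, weekly death rates taken as frequencies."""
    total = sum(rates)
    mean_week = sum(week * y for week, y in enumerate(rates)) / total
    variance = sum(y * (week - mean_week) ** 2 for week, y in enumerate(rates)) / total
    return math.sqrt(variance)


def index_3(rates, normal):
    """I_3 = M'' = M - M', the mean increase in death rate during the epidemic."""
    return sum(rates) / len(rates) - normal


def index_2(rates, normal):
    """I_2 = 100 S / M'', S being the unweighted SD of the weekly rates about M."""
    m = sum(rates) / len(rates)
    s = math.sqrt(sum((y - m) ** 2 for y in rates) / len(rates))
    return 100 * s / index_3(rates, normal)


def index_5(rates, normal):
    """I_5 = (P - M') / T, the peak-time ratio."""
    return (max(rates) - normal) / len(rates)


# Hypothetical weekly death rates, not data from any city.
NORMAL = 15.0                                    # assumed pre-epidemic mean M'
sharp = [25, 60, 110, 60, 25]                    # short, pointed outbreak
flat = [20, 25, 30, 35, 40, 35, 30, 25, 20, 20]  # long, low outbreak

for name, curve in (("sharp", sharp), ("flat", flat)):
    print(name,
          round(index_1(curve), 2), round(index_2(curve, NORMAL), 1),
          round(index_3(curve, NORMAL), 1), round(index_5(curve, NORMAL), 2))
```

On these invented curves I_1 is smaller for the pointed outbreak, while I_2, I_3, and I_5 are larger, which is the direction of change stated for each index in the definitions.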

Table XIII. — Showing values of different epidemicity indices of mortality in American
cities during influenza epidemic of 1918.

[The city names of the first column are not recoverable from this copy; only Buffalo, Rochester, St. Paul, Syracuse, and Toledo survive, without their row positions. Rows are therefore numbered in table order. The row positions of the 34 values of I_4 are likewise lost, and these values are listed after the table in the order in which they appear.]

Row.    I_1 (weeks).    I_2 (per cent).    I_3.      I_5.

  1        1.61             85.9           40.13     13.81
  2        6.68             58.5            9.31       .92
  3        1.54             94.5           48.61     18.61
  4        4.06             60.?           17.04      2.41
  5        1.98             88.5           83.47      9.62
  6        1.85             92.0           81.19     10.55
  7        2.00             88.9           27.68      7.94
  8        1.98             72.4           24.04      6.61
  9        4.55             69.8           15.41      2.15
 10        3.63             74.2           18.30      4.09
 11        3.55             56.4           14.94      2.74
 12        0.24             91.4           24.67      7.20
 13        1.66             80.9           38.70     11.92
 14        3.41             65.7            8.10      1.68
 15        3.42             55.9           12.51      2.15
 16        4.11             78.4           15.45      3.07
 17        5.60             62.7           15.78      2.00
 18        1.70             71.5           34.60     10.58
 19        1.76             94.7           24.15      8.60
 20        4.48             57.4           11.57      1.53
 21        5.98             55.1            9.80      1.12
 22        1.58             72.6           39.39     13.83
 23        5.70             99.0           15.34      2.81
 24        5.43            100.6           18.89      3.16
 25        1.69             90.2           40.95     14.60
 26        2.19             71.2           23.29      5.67
 27        5.25             77.0           18.74      3.35
 28        4.17             69.6           18.47      2.91
 29        1.52             86.2           66.08     20.51
 30        2.79             67.0           37.62      7.82
 31        2.46             86.4           21.79      5.60
 32        1.33             66.1           35.12     13.91
 33        4.48             79.2           13.94      2.62
 34        4.06             59.1           13.47      2.11
 35        5.12             57.8           11.31      1.43
 36        5.06             78.4           26.50      4.49
 37        2.09             94.2           30.77      8.97
 38        1.67             69.8           17.19      5.95
 39        1.49             66.3           45.08     15.34

I_4 (34 cities, in order of appearance): 4.7, 2.7, 6.1, 6.5, 5.8, 5.9, 3.8, 4.0, 4.0, 3.2, 3.5, 5.8, 1.5, 2.5, 3.6, 5.2, 5.1, 2.9, 2.7, 7.8, 5.1, 6.6, 7.2, 4.7, 5.9, 7.3, 8.0, 5.3, 2.7, 3.0, 3.3, 7.5, 2.?, 6.6.


Of these five indices there are only two which need to be taken
seriously into account as practical working measures of epidemicity.
These are the first and last, I_1 and I_5. The other three fail in that
they do not adequately take account of the time or duration variable,
which, as we have already seen, must be an essential factor in
measuring epidemic explosiveness. These other indices really measure
other aspects of the epidemic better than they do explosiveness of
the outbreak, which is the thing we are just now interested in. The
inadequacy of I_2, I_3, or I_4 to measure relative explosiveness of
outbreak can be readily seen by comparing, city by city, the values given
in these columns of Table XIII with the curves for the same cities
in Figures 1-6.

As between I_1 and I_5 the advantage, for present purposes, of I_5
is clear. It is numerically more sensitive to changes in the epidemic
mortality curves. This fact is reflected in a comparison of the 
relative variation of the five indices which is made in Table XIV. 
For comparing the relative sensitivity of the indices to differences 
in the epidemic mortality curves, the ratio of the standard deviation 
of each index to its mean has been taken. This ratio has no signifi- 
cance in this case except for comparative purposes. 
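
As a check on this ratio, it can be recomputed for I_5 from the 39 values of that index printed in Table XIII. The lines below are a minimal sketch, assuming the standard deviation is taken with divisor N; the variable names are illustrative.

```python
import math

# The 39 values of I_5 as printed in Table XIII, in table order.
I5 = [13.81, 0.92, 18.61, 2.41, 9.62, 10.55, 7.94, 6.61, 2.15, 4.09,
      2.74, 7.20, 11.92, 1.68, 2.15, 3.07, 2.00, 10.58, 8.60, 1.53,
      1.12, 13.83, 2.81, 3.16, 14.60, 5.67, 3.35, 2.91, 20.51, 7.82,
      5.60, 13.91, 2.62, 2.11, 1.43, 4.49, 8.97, 5.95, 15.34]

n = len(I5)
mean = sum(I5) / n
# Assumption: population form of the standard deviation (divisor n).
sd = math.sqrt(sum((v - mean) ** 2 for v in I5) / n)
ratio = sd / mean

print(f"n = {n}, mean = {mean:.2f}, S.D. = {sd:.2f}, ratio = {ratio:.2f}")
```

The same three lines of arithmetic applied to any other column give the corresponding ratio for that index.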

Table XIV. — Relative sensitivity of different epidemicity indices.

Index.    Ratio of S. D. to mean.

I_1               0.49
I_2                .18
I_3                .49
I_4                .37
I_5                .77


By conventional biometric standards it might seem a priori that I_1
would be a better epidemicity index than I_5. Practically it is seen
from Table XIV that the superiority of I_5 is outstanding. The reason
for this superiority appears upon analysis to be that this index relates
in the simplest mathematical manner possible the two essential
factors in relative explosiveness, namely, the height of the explosion,
and the time it required, and is therefore most sensitive to differences
in relative explosiveness. The same type of constant might be used
for the measure of variation in frequency curves generally, except
for the fact that ordinarily it is impossible to delimit the range by
absolute definition, as can be done in the case of epidemics. In an
ordinary frequency curve the probable error of any determination
of the range is large. The nature of the definition of the range or
duration which we have here adopted for epidemic curves, as well as
the characteristics of epidemic curves themselves, largely reduces this
probable error in the present connection. And in any case, whatever
effect the probable error of the empiric determination of duration
may have will tend to be greater in the case of I_1 than of I_5.

Taking all the facts into consideration it has been decided to adopt
I_5 as the measure of explosiveness of outbreak in the further analytical
study of the influenza epidemic.

VI. The Correlation of the Explosiveness of the Outbreak of Mortality in the 
Influenza Epidemic with Various Other Factors. 

We come now to the most essential part of the study, namely, the 
attempt to find factors directly related to or concerned in the pro- 
duction of the extraordinary differences between different cities in 
respect of the relative explosiveness of the outbreak of epidemic 
mortality. The method of analysis which will be followed is that of
multiple correlation. 1 The general principle of the correlation method
is simple. If in the present case, for example, we should find that, 
in general, when a city had a high influenza epidemicity index it also 
had a high density of population, and conversely, that cities having 
low epidemicity indices had low density of population, it would be 
said that there was a positive correlation in variation between explo- 
siveness of epidemic and density of population. 

In a system of n variables correlation between any two, with the
others remaining constant, is measured by the coefficient

$$ r_{12.34 \ldots n} = \frac{r_{12.34 \ldots (n-1)} - r_{1n.34 \ldots (n-1)} \, r_{2n.34 \ldots (n-1)}}{\sqrt{\left(1 - r^{2}_{1n.34 \ldots (n-1)}\right)\left(1 - r^{2}_{2n.34 \ldots (n-1)}\right)}} $$

and a coefficient of zero order is found from the observations by the
following well-known expression:

$$ r_{xy} = \frac{S(xy)}{N \sigma_x \sigma_y} $$

In the present case, because of the statistically small number of 
cities for which data are available, the zero order coefficients were all 
determined by the direct product-moment method, without the 
formation of correlation tables. 
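
The direct product-moment calculation can be sketched as follows. This is a minimal illustration, not the original working: the five pairs of observations are invented, and the probable error of r is taken to be 0.6745(1 - r²)/√N, which is an assumption of the sketch, although it reproduces the probable errors attached to the coefficients reported in this section for N = 39.

```python
import math


def product_moment_r(xs, ys):
    """Zero-order correlation r = S(xy) / (N * sigma_x * sigma_y),
    where x and y are deviations from their respective means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sigma_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sigma_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return s_xy / (n * sigma_x * sigma_y)


def probable_error_r(r, n):
    """Assumed probable error of a correlation coefficient from n observations."""
    return 0.6745 * (1 - r * r) / math.sqrt(n)


# Hypothetical observations, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]
r = product_moment_r(xs, ys)
print(f"r = {r:+.3f} ± {probable_error_r(r, len(xs)):.3f}")

# Probable errors for the coefficients reported in this section (N = 39).
for reported in (0.092, -0.348, -0.262, -0.327):
    print(f"r = {reported:+.3f} ± {probable_error_r(reported, 39):.3f}")
```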

The first group of phenomena of which one would naturally wish 
to know the extent to which they were correlated with explosiveness 
of outbreak are certain general demographic characteristics of the 
several cities. The following will be considered: 

(a) Density of population. — It is conceivable — not to say a priori, 
rather probable— that the explosiveness of outbreak of any epidemic 
disease would be highly correlated with the number of persons living 
on a unit of area. The figures for density used were calculated in 
terms of persons per acre of land area, on July 1, 1916. 2 

(b) Geographical position.— It is a well known epidemiological fact
that, in certain classes of epidemic disease at least, the force of the
epidemic diminishes as one passes from the primary center or focus.
This fact was very clearly demonstrated for the 1916 poliomyelitis
epidemic by Lavinder, Freeman, and Frost, 3 where New York City
was the center. Now, in point of time, the influenza epidemic of the
autumn of 1918 in the United States began in and about Boston,
Mass. A great explosive outbreak occurred in Boston and
Cambridge earlier than in any other cities in the country. We may then
ask this question: Did the influenza epidemic, as it spread over the
whole country, follow the epidemiological rule already referred to,
becoming less intense and less explosive the farther, geographically,
it traveled from the Atlantic seaboard in general, and Boston in
particular? To answer this question, so far as the epidemic
mortality records of the present group of cities are concerned, we have
correlated the epidemicity index I_5 for each city with the distance in a
straight line of the same city from Boston, Mass., measuring these
straight line distances on a map. Such distance measurements are
rough, of course, from an absolute standpoint, but relatively they
are sufficiently accurate, and may be relied on, to show correlation
if any exists.

1 Cf. Yule, G. U. "On the Theory of Correlation," Jour. Roy. Stat. Soc., Vol. LX, 1897, and "On the
Theory of Correlation for any Number of Variables, treated by a New System of Notation," Proc. Roy.
Soc. A, vol. 79, pp. 182-193, 1907.

2 Data from "Financial Statistics of Cities Having a Population of over 30,000 in 1917." Bureau of the
Census, 1918.

3 Public Health Bulletin No. 91, U. S. Public Health Service, 1918.

(c) Age distribution of population. — In the case of a disease show- 
ing so selective a mortality in respect of age as does influenza it might 
well be the case that the explosiveness of the outbreak of epidemic 
mortality would be markedly influenced by the age composition of 
the population in the several cities. To test this point by the cor- 
relation method one must have a single numerical measure or index 
of the age composition of the population in each city. Such a single 
numerical measure is not at hand. The problem of obtaining one is 
a problem which has bothered vital statisticians for a long time, 
as the need for it always arises in death rate correlation studies of 
any sort. Theoretically, of course, no single numerical expression 
can possibly be found which will uniquely describe all the properties 
of a complex curve. The best that can be done is some form of 
approximation. 

For present purposes an index of differences in age composition of
populations was adopted, which is admittedly rough and in special
cases may be inexact, but which practically has been found, in the
case of the 40 cities here dealt with, to give a sufficiently accurate
picture of the differences in age constitution. The statistical
procedure adopted was to determine for each city the following value:

$$ \chi^2 = S\left(\frac{\Delta^2}{P}\right) $$

where Δ is the deviation for each of six age groups (viz, 0-4, 5-14,
15-24, 25-44, 45-64, 65 and over) of the percentage of the actual
population of each city in 1910 in each age group, from the
percentage in the same group in the Standard Population of Glover's 1
Life Table, denoted in the formula by P. S denotes summation of
all six values. The value χ² thus measures the extent to which
each city deviates in the age constitution of its population from a
fixed standard, but does not tell the nature or kind of the deviation.
For present purposes the latter point is unessential. We are
proposing to measure the correlation between explosiveness of
epidemic and departure of population from normal in age distribution.
Are large variations in explosiveness generally associated with large
deviations in age constitution of the population? This question can
be answered perfectly by the use of the present index of age
constitution. If it were found that there existed a high correlation
between I_5 and χ² it would be desirable and necessary to analyze
further the nature of the deviations in age constitution. But as will
presently appear this necessity does not arise.

1 Glover, J. W. United States Life Tables, 1910. Bureau of the Census, 1916.
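
The age index is a simple sum over the six age groups and may be sketched as below. The percentages used are hypothetical; in particular they are not the figures of Glover's Standard Population, which are not reproduced in this section, and the function name is illustrative.

```python
AGE_GROUPS = ("0-4", "5-14", "15-24", "25-44", "45-64", "65 and over")


def age_index(city_pct, standard_pct):
    """chi^2 = S(delta^2 / P): delta is the city percentage minus the standard
    percentage P in each age group, summed over all six groups."""
    return sum((c - p) ** 2 / p for c, p in zip(city_pct, standard_pct))


# Hypothetical percentages for illustration only (each list sums to 100).
standard = [10.0, 20.0, 20.0, 30.0, 15.0, 5.0]   # assumed standard population
city = [8.0, 18.0, 22.0, 34.0, 14.0, 4.0]        # assumed city population

for group, c, p in zip(AGE_GROUPS, city, standard):
    print(f"{group:>12}: delta = {c - p:+.1f}")
print(f"chi^2 = {age_index(city, standard):.2f}")
```

Because every term is a square, a city with an excess of young adults and one with a deficit of the same size receive the same index, which is the sense in which the index measures the extent of deviation but not its nature.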

As has been said, the age distributions for the cities in the year 
1910 were used. This was necessitated by the fact that no later 
census data were available. It seems fairly certain, however, in as 
old, large, and settled communities as these dealt with are, that the 
age composition of the population will only change slowly, and that 
1910 figures may be taken as reasonably indicative of present con- 
ditions in respect to this matter. 

(d) Percentage growth of population between 1900 and 1910. — It 
might conceivably be the case that the explosiveness of the outbreak 
of an epidemic disease would be influenced by the rapidity with
which a city had grown in the recent past. To test this possible 
factor in the present case the epidemicity index I 5 is correlated with 
the percentage growth of the population in each city in the decade 
1900-1910. 

The data for these various correlations are assembled in Table XV. 

Table XV. — Data for correlation of demographic characteristics of cities with explosiveness
of epidemic influenza mortality.

[The city names of the first column are not recoverable from this copy apart from Buffalo, Fall River, St. Paul, and Toledo, whose row positions are lost. Rows are therefore numbered in table order.]

Row.   Epidemicity   Density of population   Geographical   Age distribution   Growth in
       index I_5.    (persons per acre).     position.      χ².                population.

  1      13.81             8.89                  128              4.76              6.5
  2        .92            11.42                  920             13.06             72.3
  3      18.61            30.57                  348              6.81              9.7
  4       2.41             5.68                1,028             15.80            245.4
  5       9.62            27.36                    0              7.18             19.6
  6      10.55            18.97                  376              8.86             20.2
  7       7.94            28.23                    3              6.51             14.1
  8       6.61            20.28                  828             11.45             28.7
  9       2.15             9.10                  712              6.73             11.6
 10       4.09            20.08                  532             11.88             46.9
 11       2.74            15.18                  616              8.35             44.6
 12       7.20            12.65                  684              6.56             36.6
 13      11.92             5.91                   45             10.87             13.8
 14       1.68            11.85                  720              6.17             28.6
 15       2.15            10.96                  776              7.23             38.1
 16       3.07            16.61                  796              7.57              9.4
 17       2.00             2.40                2,520              7.67            211.5
 18      10.58            13.63                   23              7.35             11.9
 19       8.60            12.06                1,104             14.24             28.1
 20       1.53            26.92                  832             10.33             31.0
 21       1.12            11.27                1,084             11.46             48.7
 22      13.83            10.11                  924              9.19             36.5
 23       2.81            27.52                  192             10.19             41.2
 24       3.16            13.06                  100              6.81             23.7
 25      14.60             2.96                1,332              9.25             18.1
 26       5.67            29.54                  164             11.79             38.7
 27       3.35             6.41                2,604              6.51            124.3
 28       2.91             8.34                1,248             10.83             21.0
 29      20.51            21.02                  260              7.19             19.7
 30       7.82            22.81                  456             11.53             18.2
 31       5.60            22.35                   40              6.88             27.8
 32      13.91            10.76                  460             10.55             50.1
 33       2.62            18.62                  328              6.97             34.2
 34       2.11            19.36                1,004              9.51             19.4
 35       1.43             7.40                1,072             12.70             31.7
 36       4.49            17.55                2,624             12.65             21.6
 37       8.97            13.34                  248              6.21             26.6
 38       5.95            10.91                  620              7.26             27.8
 39      15.34             9.55                  376              6.58             18.8

As a matter of record, and for reference in connection with the
correlation data, the mean and standard deviation of the variables
included in Table XV are given in Table XVI.

Table XVI. — Constants for demographic data of Table XV.

Character.                    Mean.               Standard deviation.

Epidemicity index I_5           6.78  ±  0.56         5.22  ±  0.40
Density of population          15.17  ±   .82         7.56  ±   .58
Geographical position         721     ± 71.00       653.95  ± 50.00
Age distribution χ²             9.063 ±   .28         2.609 ±   .20
Growth in population           40.43  ±  5.2         48.31  ±  3.7


Coming now to the consideration of the correlations we have the 
following results: 

(a) For the correlation between explosiveness of epidemic mortality
(I_5) and density of population

r = +0.092 ± 0.107.

The coefficient is less than its probable error, or is, in short, sub- 
stantially zero. This value justifies the conclusion that relative 
density of population in these 39 cities had nothing to do with the 
explosiveness of the influenza outbreak. 

The insignificant degree of correlation in this case is shown
graphically in Figure 14. The plan of this figure is first to convert the
absolute values of the epidemicity index and density of population
for each city to relative figures, the mean for all cities being taken
as the base 100. The cities are then arranged in descending order of
relative epidemicity index (solid line) and the relative density figures
for the same cities are plotted as a broken line. The higher the
correlation the more closely will the two lines tend to parallel each
other. Here it is evident that the density line runs quite
independently of the epidemicity line.

(b) For the correlation between I_5 and geographical position,
measured by straight line distance from Boston

r = -0.348 ± 0.095.

This, clearly, is a wholly different order of result from that which we
had in the case of the density of population. The coefficient in the
present case is nearly four times its probable error and may almost
certainly be regarded as significant. The odds against its being
simply a widely deviant chance result of random sampling are more
than 78 to 1. 1 The sign of the coefficient is negative. This result
means that the greater the linear distance of a city from Boston the
less explosive did the outbreak of epidemic mortality in that city
tend to be. This is in accord with the general epidemiological rule
that the force of an epidemic tends to diminish as it spreads from its
primary or initial focus. It must be noted, however, that the
correlation coefficient in this case is not large. It is barely past the value
where it may safely be regarded as statistically significant. This
fact may probably be taken to mean that influenza does not follow the
epidemiological law referred to with anything like such precision as
do some other epidemic diseases, notably poliomyelitis.

1 Cf. Pearl, R., and Miner, J. R. A Table for Estimating the Probable Significance of Statistical
Constants. Me. Agr. Expt. Stat. Ann. Rept. 1914, pp. 85-88.

(c) For the correlation between explosiveness of epidemic
mortality (I_5) and the deviation of the population in the several cities
from a standard population in respect of age distribution

r = -0.262 ± 0.101.

This coefficient is only a little more than two and a half times its 
probable error, and can not safely be regarded as significant. If 
there were no correlation whatever, a value of the coefficient as 
great as the present one would be expected to occur as often as 
approximately 8 times in every 100 trials with samples of 39 each. 
In any case it is evident that the difference in age constitution of the 
population in the different cities can have had but extremely little, 
if any, influence in bringing about the observed differences in explo- 
siveness of epidemic mortality. 

(d) For the correlation between epidemicity index I_5 and
percentage growth of population in the last intercensal decade

r = -0.327 ± 0.096.

The coefficient in this case is slightly more than 3 times its probable
error, and is to be regarded as probably statistically significant. On
its face the coefficient, having the negative sign, means that there is a
definite but not pronounced tendency for cities in the 39 which made
a relatively great percentage growth in population in 1900-1910, to
show a relatively small explosion of influenza mortality during the
epidemic, and vice versa. This would seem to indicate that the epidemic
mortality tended to be greatest in the older and larger cities and least
in the newer and smaller cities, since the old and large cities generally
are not now showing so high a percentage growth from year to year
as are the younger cities. The sample of 39, however, is too small to
warrant such a conclusion, because in so large a country, and one so
relatively recently urbanized in many parts, the rate of urban popu- 
lation growth is largely bound up with distance from the Atlantic sea- 
board. The cities which showed the largest percentage increase in 
population in 1900-1910 are in general those of the middle west. 
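
The significance statements made under (a) to (d) all turn on the ratio of a coefficient to its probable error. A rough sketch of how such a ratio translates into a chance probability is given below. It assumes a normal sampling distribution, with a probable error equal to 0.6745 standard deviations; that is an assumption of the sketch and not necessarily the exact basis of the table cited in the text. The function names are illustrative.

```python
import math

PE_IN_SIGMA = 0.6745  # one probable error expressed in standard deviations


def chance_probability(ratio):
    """Probability of a deviation at least `ratio` probable errors from zero,
    in either direction, under an assumed normal sampling distribution."""
    z = PE_IN_SIGMA * abs(ratio)
    return math.erfc(z / math.sqrt(2))


def odds_against(ratio):
    """Odds against a chance deviation of this size or greater."""
    p = chance_probability(ratio)
    return (1 - p) / p


# Age distribution, case (c): r = -0.262 with probable error 0.101.
ratio_c = 0.262 / 0.101
print(f"ratio = {ratio_c:.2f}, chance probability = {chance_probability(ratio_c):.3f}")

# A ratio of about 3.7 probable errors, of the order found in case (b).
print(f"odds against, ratio 3.7: about {odds_against(3.7):.0f} to 1")
```

For case (c) the probability comes out at about 0.08, agreeing with the "approximately 8 times in every 100 trials" of the text.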

We can get at a quantitative estimate of the matter by the method
of multiple correlation. Letting the subscript 1 denote epidemicity
index I_5, 2 denote percentage growth of population 1900-1910, and
3 denote geographical position measured by straight line distance
from Boston, as before, we have for the net correlation between the
explosiveness of epidemic mortality and rate of population growth,
with geographical position constant

r_{12.3} = -0.188 ± 0.104.

It then appears that the supposition made above is substantially 
correct. This net coefficient between epidemicity index and rate of 
population growth can not be regarded as statistically significant in 
comparison with its probable error. In other words, if we make 
geographical location constant the correlation practically disap- 
pears between the other two variables. 
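
The comparison of a coefficient with its probable error, on which the argument above rests, can be checked numerically. The following is a minimal sketch, assuming the usual large-sample formula for the probable error of a correlation coefficient, P.E. = 0.6745 (1 - r^2) / sqrt(n), with n = 39 cities. The function names are illustrative and do not come from the paper.

```python
import math

N_CITIES = 39  # number of cities in the sample


def probable_error(r, n=N_CITIES):
    """Probable error of a correlation coefficient (assumed large-sample formula)."""
    return 0.6745 * (1.0 - r * r) / math.sqrt(n)


def ratio_to_probable_error(r, n=N_CITIES):
    """Number of probable errors by which the coefficient differs from zero."""
    return abs(r) / probable_error(r, n)


gross = -0.327  # population growth vs. epidemicity index, gross correlation
net = -0.188    # the same, with distance from Boston held constant

for label, r in (("gross", gross), ("net", net)):
    print(label, round(probable_error(r), 3), round(ratio_to_probable_error(r), 2))
```

Under this assumed formula the probable errors round to 0.096 and 0.104, agreeing with the printed values; the gross coefficient is somewhat more than 3 times its probable error and the net coefficient less than 2 times.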

The general conclusion to which we come from an examination of 
the correlation data assembled to this point is that these four general 
demographic factors, density of population, geographical position, 
age distribution of population, and rate of recent growth in popula- 
tion, have practically nothing to do, either severally or collectively, 
with bringing about those differences between the several cities in 
respect of explosiveness of the outbreak of epidemic mortality in 
which we are interested. Significantly causal or differentiating
factors must be sought elsewhere.

The next general field to which one naturally turned for correla- 
tion study was that of the normal death rates, both from all causes 
and from various particular causes, in the several cities. The death 
rate, crude or standardized, of any particular community of consid- 
erable size, is a relatively constant attribute of that community. The 
death rate does change, to be sure, with the passage of time, but only 
slowly. Over a short period of years the death rates of any large city 
will be found to be nearly constant. In so far they are definite attri- 
butes of the city, which are, in general, indicative of the normal vital 
condition of the population. It is, therefore, important to determine 
the extent to which the normal mortality from various causes is
correlated with the severity of the unusual and explosive mortality arising
from a great epidemic. 

Since, at the time of writing, the mortality statistics for the regis- 
tration area and its parts have been published only up to and includ- 
ing 1916, the nearest available annual death rates, in point of time, 
to the 1918 epidemic are those for 1916. 1 Accordingly, these figures 
are used. In view of the fact already stated that for large aggre- 
gates of population, death rates normally change only very slowly, it 
is clear that we are justified in taking the 1916 rates as indicative, to 
a first approximation, of the normal general mortality conditions
prevailing in the several cities at about the time (in a broad sense)
that the influenza epidemic broke out. The causes of death selected
for correlation purposes in the first study are exhibited in Table
XVII. For convenience of reference and comparison the epidemicity
index I_5, with which these death rates are to be correlated, is given
in the second column of the table. All the death rates are crude
rates.

1 Mortality Statistics, 1916, Seventeenth Annual Report. Bureau of the Census, 1918.



Table XVII. Data for correlation of explosiveness of influenza epidemic mortality
with death rates from various causes for 1916.

Column heads: I_5, epidemicity index; All, death rate from all causes per 1,000.
The remaining columns are death rates per 100,000 living, from: Tb., pulmonary
tuberculosis; Heart, organic heart diseases; Neph., acute nephritis and Bright's
disease; Infl., influenza; Pneu., pneumonia (all forms); Typh., typhoid fever;
Cancer; Meas., measles.

City             I_5   All       Tb.     Heart  Neph.  Infl.  Pneu.  Typh.  Cancer  Meas.

[illegible]    13.81  19.3     208.5     235.8  197.2   35.8  161.3    7.6   120.8   24.5
[illegible]      .92  15.3     117.0     110.2  158.5   14.7  141.2   22.0    63.5    1.6
[illegible]    18.61  18.1     200.5     193.2  174.3   21.5  235.7   18.1   106.7    5.4
[illegible]     2.41  14.1     173.9      84.7   85.8   13.2  137.5   43.5    56.1
[illegible]     9.52  16.9   145.[?]     220.4  102.6   11.2  210.8    3.4   115.8   14.5
Buffalo        10.55  16.1     142.8     170.1  127.0   10.2  166.3   10.9   100.7   15.8
[illegible]     7.94  13.5     172.6     191.2   70.8    9.7  159.3    1.8   112.4    7.1
[illegible]     6.61  14.5     132.8     159.9  107.2   11.7  158.1    5.2    91.3    5.4
[illegible]     2.15  16.4     208.3     202.7  158.8   26.8  145.4    3.2   116.2   15.3
[illegible]     4.09  14.8     132.2     119.6   90.9   16.3  182.2    5.3    86.8    8.9
[illegible]     2.74  15.5     125.2     156.4   90.3   33.5  155.9   13.0   100.5   15.8
[illegible]     7.20  15.2     121.8     180.8  119.5   18.9  146.2   19.7   114.8    1.6
Fall River     11.92  17.0     161.3     158.9  105.9   24.1  243.8   10.9    91.9   30.4
[illegible]     1.68  12.2      64.7     134.8   88.9    9.4   70.2   16.4    88.1    2.3
[illegible]     2.15  15.6     159.6     175.6  115.0   17.4  141.8   26.1    99.4    9.8
[illegible]     3.07  15.0     159.9     145.7  154.0   33.1  146.9   13.4    83.7    2.1
[illegible]     2.00  12.3     176.7     161.0  111.3    9.3   78.0    2.6   105.6    2.0
[illegible]    10.58  17.3     103.3     161.6   89.2   14.1  178.4   11.5    85.7   25.6
[illegible]     8.60  19.8     262.1     145.1  171.1   37.0  136.9   26.7    85.2    2.7
[illegible]     1.53  12.7      78.8     102.9   79.9   15.8  154.2   15.3    92.8   27.7
[illegible]     1.12  12.4     117.8     120.0  101.8    8.8  111.4    5.5    96.0   20.4
[illegible]    13.83  17.2     201.8     211.2  132.8   25.0  152.6   37.1    77.6     .9
[illegible]     2.81  15.0     115.5     153.6  140.9   17.4  161.2    6.1    85.6   25.7
[illegible]     3.16  17.0      95.5     175.0  122.3   37.4  225.1    8.7   116.2    5.3
New Orleans    14.60  18.4     259.0     207.4  231.1   26.9  117.3   23.1    93.1    3.5
[illegible]     5.07  13.9     154.9     168.7  131.4    9.8  179.9    3.9    84.5    9.9
[illegible]     3.35  10.5      94.2     189.3   89.1    8.6   75.5    4.0    89.6
[illegible]     2.91  14.4     101.5      93.7   91.3   18.7  173.4    3.0    90.0    1.8
Philadelphia   20.51  16.2     170.6     197.4  177.7   24.0  172.2    7.6   101.1    6.6
[illegible]     7.82  17.4     110.7   14[?].7   92.0   26.6  331.0    9.0    89.8   23.7
[illegible]     5.60  15.8     134.1     167.5  142.4   25.9  174.1    5.1   100.0   25.1
[illegible]    13.91  19.7     187.0     189.5  204.9   20.4  194.0   23.6    97.0   26.2
[illegible]     2.62  14.4      91.9     192.3  136.7    8.9  121.6    5.0   114.7    8.1
[illegible]     2.11  14.9     129.0     144.6  170.8   22.8  173.5    9.4    95.3    8.8
St. Paul        1.43  11.8      99.1     122.6   92.6    9.3   80.5    5.7    87.0    7.3
[illegible]     4.49  15.4     169.4     250.7  135.3    4.1  129.0    3.5   133.1    1.3
[illegible]     8.97  15.2      83.0     201.1  112.5   10.9  134.3   12.2   110.5
[illegible]     5.95  18.1     168.1     192.8   89.3   19.7  156.5   22.2    97.9   33.8
[illegible]    15.34  17.8     187.4     230.5  168.1   24.2  164.3   12.9   107.7    2.2

The basic variation constants for the data of Table XVII are assem- 
bled in Table XVIII. In the last column of the table have been 
placed the values of the gross or zero order correlation coefficients 
measuring the correlation between the epidemicity index I_5 (which
we have adopted as the measure of the explosiveness of the outbreak 
of epidemic mortality) on the one hand, and the death rates from the 
several causes, on the other hand. 
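
As an illustration of how such variation constants are obtained, the sketch below recomputes the mean and standard deviation of the measles death rate from the entries of Table XVII. It rests on two assumptions that are not stated in the text: that the three cities with no printed measles entry are counted as zero, and that the probable errors follow the usual formulas 0.6745 s / sqrt(n) for the mean and 0.6745 s / sqrt(2n) for the standard deviation s.

```python
import math

# Measles death rates per 100,000, as read from Table XVII (36 printed entries).
measles = [24.5, 1.6, 5.4, 14.5, 15.8, 7.1, 5.4, 15.3, 8.9, 15.8, 1.6, 30.4,
           2.3, 9.8, 2.1, 2.0, 25.6, 2.7, 27.7, 20.4, 0.9, 25.7, 5.3, 3.5,
           9.9, 1.8, 6.6, 23.7, 25.1, 26.2, 8.1, 8.8, 7.3, 1.3, 33.8, 2.2]

# Assumption: the three cities without a printed entry are counted as zero.
rates = measles + [0.0] * 3

n = len(rates)
mean = sum(rates) / n
# Standard deviation with divisor n (not n - 1).
sd = math.sqrt(sum((x - mean) ** 2 for x in rates) / n)
pe_mean = 0.6745 * sd / math.sqrt(n)      # probable error of the mean
pe_sd = 0.6745 * sd / math.sqrt(2 * n)    # probable error of the standard deviation

print(f"mean = {mean:.2f} ± {pe_mean:.2f}, s.d. = {sd:.2f} ± {pe_sd:.2f}")
```

Under these assumptions the sketch gives 11.00 ± 1.09 for the mean and 10.08 ± 0.77 for the standard deviation, which agree with the measles line of Table XVIII.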
Table XVIII. Mean and standard deviation for death rates from various causes.

                                                                          Coefficient of correlation
                                                        Standard          between epidemicity index
                                        Mean death      deviation in      I_5 and the death rate
Cause of death.                         rate.           death rate.       from the specified cause.

All causes 1                             15.55 ± 0.24     2.21 ± 0.17     +0.661 ± 0.061
Pulmonary tuberculosis                  147.50 ± 4.94    45.73 ± 3.49     + .525 ± .078
Organic heart disease                   168.29 ± 4.19    38.82 ± 2.96     + .567 ± .073
Acute nephritis and Bright's disease    127.39 ± 4.17    38.57 ± 2.95     + .507 ± .080
Influenza                                18.80 ± .96      8.86 ± .68      + .287 ± .099
Pneumonia (all forms)                   168.40 ± 5.18    47.99 ± 3.60     + .388 ± .092
Typhoid fever                            12.41 ± 1.04     9.64 ± .74      + .176 ± .105
Cancer                                   97.07 ± 1.62    14.99 ± 1.14     + .198 ± .104
Measles                                  11.00 ± 1.09    10.08 ± .77        .069 ± .107

1 Death rate per 1,000; in all other cases in the table the death rate is per 100,000.

The outstanding fact which strikes one at once from this table is 
the high order of the correlation which exists between the explosive- 
ness of the outbreak of epidemic mortality in these communities and 
the normal death rate from certain causes of death in the same
communities. In the first four lines of the table the correlation 
coefficients range from about 6 to more than 10 times the probable 
errors. There can be no question as to the statistical significance of
coefficients of such magnitude. On the other hand, the remaining 
coefficients in the table are of a distinctly lower order of magnitude, 
ranging from smaller than the probable error up to three or four times 
that value. It is clear that we have here hit upon a clue as to the 
basis of the observed variation in cities in respect of explosiveness 
of epidemic influenza mortality which will repay careful examination. 

The highest correlation coefficient of all is that on the first line 
of the table, for the correlation of epidemicity index with death rate 
from all causes. The existence of this high correlation at once 
indicates that an essential factor in determining the degree of explo- 
siveness of the outbreak of epidemic influenza in a particular city 
was the normal mortality conditions prevailing in that city. In 
the group of communities here dealt with those cities which had a 
relatively high normal death rate had also a relatively severe and 
explosive mortality from the influenza epidemic. Similarly, cities 
which normally have a low death rate had a relatively low, and not 
sharply explosive, increase in mortality during the epidemic. 

It will also be noted that the correlations in the next three lines of
the table, namely those for pulmonary tuberculosis, so-called organic
diseases of the heart, and acute nephritis and Bright's disease, are
of the same order of magnitude as that between the death rate from 
all causes and the explosiveness of epidemic outbreak of influenza. 
These facts have certain aspects of general biological, and, in the 
opinion of the writer, hygienic interest. They will, however, not be
discussed here, save in one respect. 

Because of the potential importance of these facts, it is desirable to 
examine them with the greatest critical care. A point which occurs 
to one at once is the possibility that the observed high correlation 
between epidemicity index and pulmonary tuberculosis, organic heart 
diseases, and acute nephritis and Bright's disease, arises because of
differences in age constitution of the population in the different cities. 
In general, it is known that the crude death rate from these causes is 
influenced, in greater or less degree, by the age constitution of the 
population. May this not be the whole, or at least the main, cause of 
the observed correlation? Again, it has already been seen earlier in
the paper that there is a distinct, though small, correlation between 
the geographical position of the cities studied and the explosiveness 
of the epidemic mortality. May this factor not play an important 
part in the observed correlations of the epidemicity index with the 
causes of death showing a high correlation with epidemicity index? 

The simplest and most direct method of settling these questions is 
that of multiple correlation. What is needed is to get the net cor- 
relation between the death rate from organic heart diseases, let us 
say, and epidemicity index, for a constant age distribution of the 
population and constant geographical position. In the usual ter- 
minology of vital statistics we must correct our results for age dis- 
tribution and geographical position. If we let the subscript 1 denote 
the cause of death (pulmonary tuberculosis, organic heart disease, 
or acute nephritis and Bright's disease, as the case may be); the 
subscript 2 denote the value of the measure of the explosiveness of 
the epidemic mortality, our epidemicity index I_5; the subscript 3
denote geographical position, measured as before by linear distance 
from Boston; and the subscript 4 denote deviation of the population 
from a standard age distribution, the thing desired to settle the 
points raised above is the net correlation coefficient, r_12.34.
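
A net coefficient such as r_12.34 can be built up from the gross coefficients by eliminating one variable at a time. The sketch below uses the standard recursive formula for partial correlation; whether this is the exact form of the equation given earlier in the paper is an assumption. The correlation matrix in the example is hypothetical and serves only to show the calling convention. Variables 1 to 4 of the text correspond to indices 0 to 3.

```python
import math


def partial_corr(R, i, j, controls):
    """Correlation between variables i and j with the variables in `controls`
    held constant, from a symmetric matrix R of gross correlations."""
    if not controls:
        return R[i][j]
    k, rest = controls[0], controls[1:]
    # Remove variable k from coefficients that already exclude `rest`.
    r_ij = partial_corr(R, i, j, rest)
    r_ik = partial_corr(R, i, k, rest)
    r_jk = partial_corr(R, j, k, rest)
    return (r_ij - r_ik * r_jk) / math.sqrt((1 - r_ik ** 2) * (1 - r_jk ** 2))


# Hypothetical gross correlations, NOT taken from the paper:
# 0 = death rate from a cause, 1 = epidemicity index,
# 2 = distance from Boston, 3 = deviation from standard age distribution.
R = [
    [1.00, 0.55, -0.20, 0.30],
    [0.55, 1.00, -0.25, 0.20],
    [-0.20, -0.25, 1.00, 0.10],
    [0.30, 0.20, 0.10, 1.00],
]

print(round(partial_corr(R, 0, 1, [2, 3]), 3))  # net coefficient r_12.34
```

When the controlled variables are uncorrelated with the first two, the net coefficient equals the gross one; when they are correlated, the net coefficient may be either smaller or larger than the gross.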

By means of the equation already given (p. 1773) these net
coefficients have been determined with the following results:

1. Net correlation between influenza epidemicity index and death
rate from pulmonary tuberculosis, for constant age distribution and
geographical position, r_12.34 = +0.609 ± 0.068.

2. Net correlation between influenza epidemicity index and death
rate from organic diseases of the heart, for constant age distribution
and geographical position, r_12.34 = +0.594 ± 0.070.

3. Net correlation between influenza epidemicity index and death
rate from acute nephritis and Bright's disease, for constant age
distribution and geographical position, r_12.34 = +0.510 ± 0.080.

From these results it is seen that, instead of the correlation
between the explosiveness of epidemic mortality and death rate from the
diseases mentioned being due to uncorrected age and locality factors,
the net correlations after correction has been made for these factors, 
are actually higher than were the gross, uncorrected correlations. The 
net correlation of the pulmonary tuberculosis death rate with epi- 
demicity index is the highest of the three. It has a value about 9 
times its probable error. The chances are literally billions to 1 
against this correlation being due to accident or chance. We may 
conclude that the most significant factor yet discovered in causing 
the observed wide variation amongst these 39 American cities in 
respect of the explosiveness of the outbreak of epidemic influenza 
mortality in the autumn of 1918 was the relative normal liability of 
the inhabitants of the several cities to die of one or another of 
the three great causes of death which primarily result from a 
functional breakdown of one of the three fundamental organ systems 
of the animal body, the lungs, the heart, and the kidneys. 
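
The statement about the odds against chance can be given a rough numerical form. The sketch below assumes that the sampling error of the coefficient follows the normal curve, converts a multiple of the probable error into standard deviations (one probable error being 0.6745 standard deviations), and takes the two-sided tail area. The function name is illustrative, and the exact figure depends on these assumptions.

```python
import math


def odds_against_chance(ratio):
    """Odds against a deviation of `ratio` probable errors arising by chance,
    assuming a normal sampling distribution and a two-sided test."""
    z = 0.6745 * ratio                   # probable errors -> standard deviations
    p = math.erfc(z / math.sqrt(2.0))    # two-sided tail probability
    return (1.0 - p) / p


# A coefficient 3 times its probable error, the conventional threshold.
print(f"{odds_against_chance(3):.0f} to 1")

# The net tuberculosis coefficient, +0.609 with probable error 0.068.
print(f"{odds_against_chance(0.609 / 0.068):.2e} to 1")
```

On these assumptions a coefficient 3 times its probable error corresponds to odds of a little over 20 to 1, while one about 9 times its probable error corresponds to odds of hundreds of millions to 1 or more.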

VII. Summary. 

In this first study the weekly mortality statistics of the influenza 
epidemic beginning in the autumn of 1918 have been analyzed in a 
preliminary way for some 39 large American cities. It has been shown 
in the first instance that there was an extraordinary degree of varia- 
tion amongst the several cities in this group of cities in respect of 
the relative degree of explosiveness of the outbreak of epidemic 
mortality. The first problem confronting the student of the epidemic 
was the analysis of this variation, to find, if possible, primary factors 
concerned in its causation. Such an analysis, by the method of mul- 
tiple correlation, appears to demonstrate that an important factor 
so far found in causing the observed wide variation amongst these 39 
American cities in respect of the explosiveness of the outbreak of 
epidemic influenza mortality in the autumn of 1918 was the magni- 
tude of the normal death rates observed in the same communities, 
particularly those death rates from pulmonary tuberculosis, diseases 
of the heart and of the kidneys. 



OBSERVATIONS ON THE FOOD OF ANOPHELES LARVAE. 

By C. W. Metz, Ph. D., Special Investigator, United States Public Health Service. 

Obviously, food is an important factor in determining the abun- 
dance and distribution of Anopheles larvae, and for this reason it is 
a factor to be considered in connection with Anopheles eradication. 
The following results are from experiments and observations made in 
an attempt to ascertain the essential food requirements of Anopheles 
larvae. At first it was intended that the analysis extend to the
particular species of animals and plants contributing to the larval food,



